Words containing ,:

Rank in Wordlist | Frequency | Word |
---|---|---|
4372 | 40 | 1,5 |
6848 | 22 | 3,5 |
8434 | 17 | 2,5 |
10414 | 13 | 4,5 |
12592 | 10 | 1,7 |
13557 | 9 | 1,2 |
13558 | 9 | 1,6 |
14728 | 8 | 1,4 |
14779 | 8 | 3,2 |
16184 | 7 | 0,5 |

Words containing %:

Rank in Wordlist | Frequency | Word |
---|---|---|
4942 | 34 | 50% |
5173 | 32 | 10% |
5865 | 27 | 90% |
6394 | 24 | 40% |
6623 | 23 | 20% |
7381 | 20 | 100% |
7390 | 20 | 25% |
7735 | 19 | 60% |
8438 | 17 | 30% |
8440 | 17 | 5% |

Words containing &:

Rank in Wordlist | Frequency | Word |
---|---|---|
7509 | 20 | R&B |
28956 | 3 | A&W |
28986 | 3 | AT&T |
33956 | 3 | W&P |
38037 | 2 | AR&Co |
39713 | 2 | C&S/Sovran |
40907 | 2 | E&M |
41967 | 2 | H&E |
46230 | 2 | PP&K |
61925 | 1 | 6:26&32 |

Words containing $:

Rank in Wordlist | Frequency | Word |
---|---|---|
38049 | 2 | AS$58,1 |
50000 | 2 | US$/ |
50001 | 2 | US$100 |
50002 | 2 | US$350 |
50003 | 2 | US$70 |
58623 | 1 | 18-$20 |
59147 | 1 | 1:000$000 |
63092 | 1 | AS$0,8 |
63093 | 1 | AS$1 |
63094 | 1 | AS$1,1 |

Words containing ":

Rank in Wordlist | Frequency | Word |
---|---|---|
217 | 725 | ." |

Words containing ':

Rank in Wordlist | Frequency | Word |
---|---|---|
5870 | 27 | Al-Qur'an |
8845 | 16 | .' |
8927 | 16 | Don't |
11922 | 11 | I'm |
12081 | 11 | Qur'an |
12783 | 10 | Girls' Generation |
13823 | 9 | I'll |
14058 | 9 | Queen's |
14180 | 9 | Women's |
15043 | 8 | It's |

Words containing +:

Rank in Wordlist | Frequency | Word |
---|---|---|
29660 | 3 | CD+DVD |
59269 | 1 | 2-6-0+0-6-2 |
60198 | 1 | 293+800 |
60255 | 1 | 3+1 |
60256 | 1 | 3+2 |
60257 | 1 | 3+3 |
61651 | 1 | 6+6 |
61944 | 1 | 7+1 |
62250 | 1 | 8+4 |
73944 | 1 | Ctrl+KC |

Words containing /:

Rank in Wordlist | Frequency | Word |
---|---|---|
3719 | 49 | jiwa/km² |
6111 | 26 | dan/atau |
7629 | 20 | km/jam |
8804 | 17 | s/d |
10538 | 13 | Kabupaten/Kota |
10799 | 13 | https://www |
10814 | 13 | kabupaten/kota |
15817 | 8 | km/h |
17750 | 7 | mm/tahun |
18359 | 6 | DI/TII |

In the last subsection of this type, we look for words containing other special characters: , ( ) % & $ " ' + * = / _
Depending on the language, some of these characters may be allowed within words, while others are not. If words containing forbidden characters do not have a very low frequency, there might be a problem in preprocessing.
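
The check described above can be sketched in a few lines of Python. This is only an illustration: the function name `flag_suspicious`, the set of forbidden characters, and the frequency threshold are assumptions made for the example, since which characters are forbidden and what counts as "very low" depend on the language and the corpus. The sample rows are taken from the tables in this section.

```python
# Sample (word, frequency) pairs copied from the tables in this section.
SAMPLE = [
    ("50%", 34),
    ("R&B", 20),
    ("AT&T", 3),
    ("US$100", 2),
    ("3+1", 1),
    ("km/jam", 20),
]


def flag_suspicious(rows, forbidden, min_freq):
    """Return the rows whose word contains a forbidden character and whose
    frequency is at least min_freq (both parameters are assumptions)."""
    return [
        (word, freq)
        for word, freq in rows
        if freq >= min_freq and any(ch in forbidden for ch in word)
    ]


# Hypothetical policy: & $ + are not allowed inside words, and a frequency
# of 3 or more is treated as "not very low".
flagged = flag_suspicious(SAMPLE, forbidden=set("&$+"), min_freq=3)
print(flagged)
```

Words returned by such a check are candidates for a closer look at the tokenizer or the cleaning steps; they are not necessarily errors.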
Example query for words containing %:
select w_id-100, freq, word from words where w_id>100 and word like "%\%%" limit 10;
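
The same query can be run for each of the listed characters. The sketch below uses Python's standard `sqlite3` module with an in-memory table. It assumes a `words` table with the columns `w_id`, `word`, and `freq`, as the query above suggests. The helper names `like_pattern` and `words_containing` are hypothetical. Two details differ from the query above and are assumptions of this sketch: SQLite has no default escape character for LIKE, so an explicit `ESCAPE` clause is used, and an `ORDER BY w_id` is added so that the result order is deterministic.

```python
import sqlite3

# The special characters listed in this section.
SPECIAL_CHARS = [",", "(", ")", "%", "&", "$", '"', "'", "+", "*", "=", "/", "_"]


def like_pattern(ch):
    """Build a LIKE pattern matching any word that contains `ch` literally."""
    # % and _ are LIKE wildcards, so they (and the escape character itself)
    # are prefixed with a backslash.
    if ch in ("%", "_", "\\"):
        ch = "\\" + ch
    return "%" + ch + "%"


def words_containing(conn, ch, limit=10):
    """Return up to `limit` (rank, freq, word) rows for words containing `ch`."""
    sql = (
        "SELECT w_id - 100, freq, word FROM words "
        "WHERE w_id > 100 AND word LIKE ? ESCAPE '\\' "
        "ORDER BY w_id LIMIT ?"
    )
    return conn.execute(sql, (like_pattern(ch), limit)).fetchall()


# Toy data: ranks and frequencies copied from the tables in this section.
# The row with w_id 50 is hypothetical and only shows the w_id > 100 filter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO words (w_id, word, freq) VALUES (?, ?, ?)",
    [
        (50, "%", 999),
        (5042, "50%", 34),
        (5273, "10%", 32),
        (7609, "R&B", 20),
        (7729, "km/jam", 20),
    ],
)

for ch in SPECIAL_CHARS:
    rows = words_containing(conn, ch)
    if rows:
        print(ch, rows)
```

Passing the pattern as a bound parameter avoids quoting problems for the characters " and ', and escaping % and _ keeps them from acting as wildcards, which would otherwise match every word.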
3.12.1 Words with Hyphens
3.12.2 Multiwords
3.12.3 (Multi-)Words with dots